Analysis of the algorithm: From kernels to backup genes.

Kernelization section

The algorithm transformed each semantic similarity matrix to make it compatible with kernel computation. Once this was done for each network and kernel type, the resulting kernels were integrated by kernel type. The following subsections give a general analysis of the matrix properties at each phase of the process.

Annotation properties

Table 1. Annotation file descriptors

Net Min Max Average Standard_Deviation
biological_process_sim 1 1053 8.255678448065316 20.730757162592834
cellular_component_sim 1 5172 7.219656420451649 82.56741928666138
disease_sim 1 81 1.7829472267615112 2.3512948371338793
gene_PS_sim 1 108 2.1519058295964126 5.045578888786698
gene_TF_sim 1 5778 3.0320060963993143 74.46796191948712
gene_hgncGroup_sim 1 2294 2.261042369703787 23.27548667222241
molecular_function_sim 1 6791 4.87167748867193 54.08093332133869
pathway_sim 1 479 6.75436035343834 16.179001796162975
phenotype_sim 1 1307 23.941396322320227 48.71098651648341
protein_interaction_sim 1 7338 610.0699285559645 500.6662722348227

Matrix properties

Table 2. Similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process_sim 16994x16994 288796036 256327240
cellular_component_sim 17963x17963 322669369 320852982
disease_sim 4162x4162 17322244 17243154
gene_PS_sim 3020x3020 9120400 95698
gene_TF_sim 1044x1044 1089936 172858
gene_hgncGroup_sim 25136x25136 631818496 14457654
genetic_interaction_effect_bicor_sim 17354x17354 301161316 202089222
molecular_function_sim 17333x17333 300432889 296559336
pathway_sim 3429x3429 11758041 159182
phenotype_sim 5077x5077 25775929 25715036
protein_interaction_sim 18476x18476 341362576 11271602
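
The three descriptors reported for each matrix (dimensions, total elements and non-zero elements) can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names `matrix_descriptors` and `density` and the toy matrix are hypothetical, and the pipeline's own implementation may differ.

```python
def matrix_descriptors(matrix):
    """Return the descriptors used in the tables for a matrix given as a list of rows."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    # Count every entry that is not exactly zero.
    non_zero = sum(1 for row in matrix for value in row if value != 0)
    return {
        "Matrix_Dimensions": f"{n_rows}x{n_cols}",
        "Matrix_Elements": n_rows * n_cols,
        "Matrix_Elements_Non_Zero": non_zero,
    }


def density(elements, non_zero):
    """Fraction of entries that are non-zero."""
    return non_zero / elements


# Hypothetical 3x3 similarity matrix, used only to exercise the function.
toy = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]
toy_descriptors = matrix_descriptors(toy)

# Density of gene_PS_sim, using the counts reported in Table 2.
gene_ps_density = density(9120400, 95698)
```

Applied to the counts in Table 2, this gives a density of roughly 1% for gene_PS_sim (95698 of 9120400 entries), while cellular_component_sim is more than 99% dense (320852982 of 322669369 entries).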

Table 3. Filtered similarity matrices

Net Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process_sim 16994x16994 288796036 256327240
cellular_component_sim 17963x17963 322669369 320852982
disease_sim 4162x4162 17322244 17243154
gene_PS_sim 3020x3020 9120400 95698
gene_TF_sim 1044x1044 1089936 172858
gene_hgncGroup_sim 25136x25136 631818496 14457654
genetic_interaction_effect_bicor_sim 17354x17354 301161316 202089222
molecular_function_sim 17333x17333 300432889 296559336
pathway_sim 3429x3429 11758041 159182
phenotype_sim 5077x5077 25775929 25715036
protein_interaction_sim 18476x18476 341362576 11271602

Table 4. Uncombined kernel matrices

Net Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
biological_process ct 16994x16994 288796036 288796036
biological_process el 16994x16994 288796036 288796036
biological_process ka 16994x16994 288796036 256344234
biological_process rf 16994x16994 288796036 288796036
cellular_component ct 17963x17963 322669369 322669369
cellular_component el 17963x17963 322669369 322669369
cellular_component ka 17963x17963 322669369 320870945
cellular_component rf 17963x17963 322669369 322669369
disease ct 4162x4162 17322244 17322244
disease el 4162x4162 17322244 17322244
disease ka 4162x4162 17322244 17247316
disease rf 4162x4162 17322244 17322244
gene_PS ct 3020x3020 9120400 8927548
gene_PS el 3020x3020 9120400 5842664
gene_PS ka 3020x3020 9120400 98718
gene_PS node2vec 3020x3020 9120400 9120400
gene_PS rf 3020x3020 9120400 5842664
gene_TF ct 1044x1044 1089936 1089910
gene_TF el 1044x1044 1089936 1062998
gene_TF ka 1044x1044 1089936 173902
gene_TF node2vec 1044x1044 1089936 1089936
gene_TF rf 1044x1044 1089936 1062998
gene_hgncGroup ct 25136x25136 631818496 631304330
gene_hgncGroup el 25136x25136 631818496 326932198
gene_hgncGroup ka 25136x25136 631818496 14482790
gene_hgncGroup rf 25136x25136 631818496 326932198
genetic_interaction_effect_bicor ct 17354x17354 301161316 301161316
genetic_interaction_effect_bicor el 17354x17354 301161316 301161316
genetic_interaction_effect_bicor ka 17354x17354 301161316 202106576
genetic_interaction_effect_bicor rf 17354x17354 301161316 301161316
molecular_function ct 17333x17333 300432889 300432889
molecular_function el 17333x17333 300432889 300432889
molecular_function ka 17333x17333 300432889 296576669
molecular_function rf 17333x17333 300432889 300432889
pathway ct 3429x3429 11758041 11744123
pathway el 3429x3429 11758041 8641125
pathway ka 3429x3429 11758041 162611
pathway node2vec 3429x3429 11758041 11758041
pathway rf 3429x3429 11758041 8641125
phenotype ct 5077x5077 25775929 25775929
phenotype el 5077x5077 25775929 25775929
phenotype ka 5077x5077 25775929 25720113
phenotype rf 5077x5077 25775929 25775929
protein_interaction ct 18476x18476 341362576 341362576
protein_interaction el 18476x18476 341362576 341362576
protein_interaction ka 18476x18476 341362576 11233250
protein_interaction node2vec 18476x18476 341362576 341362576
protein_interaction rf 18476x18476 341362576 341362576

Table 5. Integrated kernel matrices

Integration Kernel Matrix_Dimensions Matrix_Elements Matrix_Elements_Non_Zero
integration_mean_by_presence ct 32651x32651 1066087801 802388152
integration_mean_by_presence el 32651x32651 1066087801 576646501
integration_mean_by_presence ka 32651x32651 1066087801 361939949
integration_mean_by_presence node2vec 17830x17830 317908900 197103190
integration_mean_by_presence rf 32651x32651 1066087801 576646501
mean ct 32651x32651 1066087801 802387454
mean el 32651x32651 1066087801 576646501
mean ka 32651x32651 1066087801 361939949
mean node2vec 17830x17830 317908900 197103190
mean rf 32651x32651 1066087801 576646501
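
To make the integration step more concrete, the following Python sketch averages several kernels of the same type over the union of their nodes. It rests on assumptions that the report does not state: that `mean` divides each summed entry by the total number of networks, and that `integration_mean_by_presence` divides it by the number of networks in which both nodes are present. The function name `integrate`, its `by_presence` flag and the toy kernels are hypothetical.

```python
def integrate(kernels, by_presence=False):
    """Average kernels over the union of their node labels.

    kernels: list of (labels, matrix) pairs, where matrix is a list of rows
    ordered like labels. Returns (union_labels, integrated_matrix).
    """
    union = sorted(set().union(*(labels for labels, _ in kernels)))
    index = {label: i for i, label in enumerate(union)}
    n = len(union)
    total = [[0.0] * n for _ in range(n)]
    count = [[0] * n for _ in range(n)]

    for labels, matrix in kernels:
        positions = [index[label] for label in labels]
        for a, i in enumerate(positions):
            for b, j in enumerate(positions):
                total[i][j] += matrix[a][b]
                count[i][j] += 1  # this network contains both nodes

    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumed difference between the two integration modes.
            denominator = count[i][j] if by_presence else len(kernels)
            if denominator:
                result[i][j] = total[i][j] / denominator
    return union, result


# Two hypothetical kernels that share only node "B".
kernel_1 = (["A", "B"], [[2.0, 1.0], [1.0, 2.0]])
kernel_2 = (["B", "C"], [[4.0, 2.0], [2.0, 4.0]])

labels, mean_kernel = integrate([kernel_1, kernel_2])
_, presence_kernel = integrate([kernel_1, kernel_2], by_presence=True)
```

Under these assumptions both modes produce a matrix over the same union of nodes and differ only in how entries present in a subset of the networks are scaled. A pair of nodes that never co-occurs in any network, such as "A" and "C" in the toy data, stays at zero in both.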

Weight values